Overview

Dataset statistics

Number of variables: 20
Number of observations: 36457
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.6 MiB
Average record size in memory: 160.0 B
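
The memory figures are consistent with every column storing one 8-byte value per row. The sketch below redoes that arithmetic. The 8-byte width and the fixed 128-byte index overhead are assumptions inferred from the reported numbers; the report itself does not state them.

```python
# Assumption: each of the 20 columns holds one 8-byte value (int64/float64)
# per row, and the default index adds a fixed 128 bytes.
N_ROWS = 36457
N_COLS = 20
BYTES_PER_VALUE = 8
INDEX_OVERHEAD = 128

column_bytes = N_ROWS * BYTES_PER_VALUE               # data of a single column
total_bytes = N_COLS * column_bytes + INDEX_OVERHEAD  # whole table

record_size_b = N_COLS * BYTES_PER_VALUE
column_kib = (column_bytes + INDEX_OVERHEAD) / 1024   # per-variable "Memory size"
total_mib = total_bytes / 1024 ** 2

print(f"record size: {record_size_b:.1f} B")
print(f"per column:  {column_kib:.1f} KiB")
print(f"total:       {total_mib:.1f} MiB")
```

Under these assumptions the results round to the 160.0 B, 284.9 KiB and 5.6 MiB shown in the report.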

Variable types

Numeric: 10
Categorical: 10

Alerts

Unnamed: 0 is highly correlated with ID (High correlation)
ID is highly correlated with Unnamed: 0 (High correlation)
CNT_CHILDREN is highly correlated with CNT_FAM_MEMBERS (High correlation)
NAME_FAMILY_STATUS is highly correlated with CNT_FAM_MEMBERS (High correlation)
CNT_FAM_MEMBERS is highly correlated with CNT_CHILDREN and 1 other field (High correlation)
CODE_GENDER is highly correlated with FLAG_OWN_CAR and 1 other field (High correlation)
FLAG_OWN_CAR is highly correlated with CODE_GENDER (High correlation)
NAME_INCOME_TYPE is highly correlated with OCCUPATION_TYPE and 2 other fields (High correlation)
OCCUPATION_TYPE is highly correlated with CODE_GENDER and 1 other field (High correlation)
CNT_FAM_MEMBERS is highly correlated with CNT_CHILDREN (High correlation)
AGE is highly correlated with NAME_INCOME_TYPE (High correlation)
YEARS_EMPLOYED is highly correlated with NAME_INCOME_TYPE (High correlation)
Unnamed: 0 is uniformly distributed (Uniform)
Unnamed: 0 has unique values (Unique)
ID has unique values (Unique)
CNT_CHILDREN has 25201 (69.1%) zeros (Zeros)
OCCUPATION_TYPE has 1241 (3.4%) zeros (Zeros)
YEARS_EMPLOYED has 6135 (16.8%) zeros (Zeros)
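
The percentages in the Zeros alerts are the zero counts divided by the number of observations. A minimal sketch of that calculation, using only figures quoted in the report:

```python
N_ROWS = 36457  # number of observations from the overview

# Zero counts as listed in the Zeros alerts.
zero_counts = {
    "CNT_CHILDREN": 25201,
    "OCCUPATION_TYPE": 1241,
    "YEARS_EMPLOYED": 6135,
}

# Share of zeros per column, in percent, rounded to one decimal place.
zero_share = {
    name: round(100 * count / N_ROWS, 1) for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```

For a column that looks integer-coded, such as OCCUPATION_TYPE, a zero is presumably just one of the category codes rather than a true zero quantity. That reading is an assumption; the report does not document the encoding.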

Reproduction

Analysis started: 2022-04-25 13:54:03.049401
Analysis finished: 2022-04-25 13:54:46.702006
Duration: 43.65 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
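
The duration follows directly from the two timestamps, and a report of this kind can be regenerated with pandas-profiling's `ProfileReport`. The sketch below shows both. The file paths and the report title are hypothetical, since the report does not name its input file, and `build_report` is an illustrative helper, not the author's script.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-04-25 13:54:03.049401")
finished = datetime.fromisoformat("2022-04-25 13:54:46.702006")
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")


def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (both paths are hypothetical)."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    ProfileReport(df, title="Dataset profile").to_file(html_path)
```

A column named `Unnamed: 0` typically appears when a CSV that was saved together with its row index is read back without `index_col=0`. If that is what happened here (an assumption), passing `index_col=0` to `pd.read_csv` would keep that column out of the profile.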

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 36457
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18228
Minimum: 0
Maximum: 36456
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1822.8
Q1: 9114
median: 18228
Q3: 27342
95-th percentile: 34633.2
Maximum: 36456
Range: 36456
Interquartile range (IQR): 18228

Descriptive statistics

Standard deviation: 10524.37372
Coefficient of variation (CV): 0.5773740245
Kurtosis: -1.2
Mean: 18228
Median Absolute Deviation (MAD): 9114
Skewness: 0
Sum: 664538196
Variance: 110762442.2
Monotonicity: Strictly increasing
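
Because this column is strictly increasing, unique, and runs from 0 to 36456 over 36457 rows, it is the sequence 0, 1, ..., n - 1, and its summary statistics have closed forms. The sketch below checks several of the reported values. It assumes the report uses the sample variance (divisor n - 1), which is the convention the reported figure matches.

```python
import math

n = 36457  # number of observations; the column holds 0, 1, ..., n - 1

mean = (n - 1) / 2               # midpoint of the sequence
total = n * (n - 1) // 2         # sum of 0..n-1
variance = n * (n + 1) / 12      # sample variance (divisor n - 1) of 0..n-1
std = math.sqrt(variance)
cv = std / mean                  # coefficient of variation

print(mean, total, round(variance, 1), round(std, 5), round(cv, 10))
```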
Histogram with fixed size bins (bins=50)

Value                 Count  Frequency (%)
0                     1      < 0.1%
24307                 1      < 0.1%
24301                 1      < 0.1%
24302                 1      < 0.1%
24303                 1      < 0.1%
24304                 1      < 0.1%
24305                 1      < 0.1%
24306                 1      < 0.1%
24308                 1      < 0.1%
24299                 1      < 0.1%
Other values (36447)  36447  > 99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
36456  1      < 0.1%
36455  1      < 0.1%
36454  1      < 0.1%
36453  1      < 0.1%
36452  1      < 0.1%
36451  1      < 0.1%
36450  1      < 0.1%
36449  1      < 0.1%
36448  1      < 0.1%
36447  1      < 0.1%

ID
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 36457
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5078226.997
Minimum: 5008804
Maximum: 5150487
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 5008804
5-th percentile: 5018456.6
Q1: 5042028
median: 5074614
Q3: 5115396
95-th percentile: 5146024.2
Maximum: 5150487
Range: 141683
Interquartile range (IQR): 73368

Descriptive statistics

Standard deviation: 41875.24079
Coefficient of variation (CV): 0.008246035637
Kurtosis: -1.212613663
Mean: 5078226.997
Median Absolute Deviation (MAD): 38093
Skewness: 0.08624228966
Sum: 1.851369216 × 10^11
Variance: 1753535791
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count  Frequency (%)
5008804               1      < 0.1%
5096993               1      < 0.1%
5096983               1      < 0.1%
5096987               1      < 0.1%
5096988               1      < 0.1%
5096990               1      < 0.1%
5096991               1      < 0.1%
5096992               1      < 0.1%
5096994               1      < 0.1%
5096978               1      < 0.1%
Other values (36447)  36447  > 99.9%

Value    Count  Frequency (%)
5008804  1      < 0.1%
5008805  1      < 0.1%
5008806  1      < 0.1%
5008808  1      < 0.1%
5008809  1      < 0.1%
5008810  1      < 0.1%
5008811  1      < 0.1%
5008812  1      < 0.1%
5008813  1      < 0.1%
5008814  1      < 0.1%

Value    Count  Frequency (%)
5150487  1      < 0.1%
5150485  1      < 0.1%
5150484  1      < 0.1%
5150483  1      < 0.1%
5150482  1      < 0.1%
5150481  1      < 0.1%
5150480  1      < 0.1%
5150479  1      < 0.1%
5150478  1      < 0.1%
5150477  1      < 0.1%

CODE_GENDER
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      24430
1      12027

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      24430  67.0%
1      12027  33.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      24430  67.0%
1      12027  33.0%

FLAG_OWN_CAR
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      22614
1      13843

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      22614  62.0%
1      13843  38.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      22614  62.0%
1      13843  38.0%

FLAG_OWN_REALTY
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
1      24506
0      11951

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      24506  67.2%
0      11951  32.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      24506  67.2%
0      11951  32.8%

CNT_CHILDREN
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4303151658
Minimum: 0
Maximum: 19
Zeros: 25201
Zeros (%): 69.1%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7423669007
Coefficient of variation (CV): 1.7251702
Kurtosis: 22.56243396
Mean: 0.4303151658
Median Absolute Deviation (MAD): 0
Skewness: 2.569382202
Sum: 15688
Variance: 0.5511086153
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Value  Count  Frequency (%)
0      25201  69.1%
1      7492   20.6%
2      3256   8.9%
3      419    1.1%
4      63     0.2%
5      20     0.1%
14     3      < 0.1%
7      2      < 0.1%
19     1      < 0.1%

Value  Count  Frequency (%)
0      25201  69.1%
1      7492   20.6%
2      3256   8.9%
3      419    1.1%
4      63     0.2%
5      20     0.1%
7      2      < 0.1%
14     3      < 0.1%
19     1      < 0.1%

Value  Count  Frequency (%)
19     1      < 0.1%
14     3      < 0.1%
7      2      < 0.1%
5      20     0.1%
4      63     0.2%
3      419    1.1%
2      3256   8.9%
1      7492   20.6%
0      25201  69.1%
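
Since CNT_CHILDREN has only nine distinct values, its mean, sum and variance can be recomputed from the frequency table alone. The sketch below does that as a consistency check. It assumes the report uses the sample variance (divisor n - 1).

```python
# Value -> count, copied from the CNT_CHILDREN frequency table.
counts = {0: 25201, 1: 7492, 2: 3256, 3: 419, 4: 63, 5: 20, 7: 2, 14: 3, 19: 1}

n = sum(counts.values())                                   # number of rows
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (divisor n - 1); an assumed convention, see the lead-in.
sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sq_dev / (n - 1)

print(n, total, round(mean, 10), round(variance, 10))
```

The counts add up to the 36457 observations, and the recomputed sum, mean and variance agree with the descriptive statistics reported for this variable.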

AMT_INCOME_TOTAL
Real number (ℝ≥0)

Distinct: 265
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186685.7367
Minimum: 27000
Maximum: 1575000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 27000
5-th percentile: 76500
Q1: 121500
median: 157500
Q3: 225000
95-th percentile: 360000
Maximum: 1575000
Range: 1548000
Interquartile range (IQR): 103500

Descriptive statistics

Standard deviation: 101789.2265
Coefficient of variation (CV): 0.5452437251
Kurtosis: 17.59808418
Mean: 186685.7367
Median Absolute Deviation (MAD): 45000
Skewness: 2.739009876
Sum: 6806001902
Variance: 1.036104663 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
135000              4309   11.8%
180000              3097   8.5%
157500              3089   8.5%
112500              2956   8.1%
225000              2926   8.0%
202500              2192   6.0%
90000               1769   4.9%
270000              1675   4.6%
315000              1001   2.7%
67500               873    2.4%
Other values (255)  12570  34.5%

Value    Count  Frequency (%)
27000    3      < 0.1%
29250    7      < 0.1%
30150    3      < 0.1%
31500    16     < 0.1%
31531.5  3      < 0.1%
31950    1      < 0.1%
32400    5      < 0.1%
33300    10     < 0.1%
33750    1      < 0.1%
36000    5      < 0.1%

Value    Count  Frequency (%)
1575000  8      < 0.1%
1350000  6      < 0.1%
1125000  3      < 0.1%
990000   4      < 0.1%
945000   4      < 0.1%
900000   39     0.1%
810000   15     < 0.1%
787500   5      < 0.1%
765000   9      < 0.1%
742500   5      < 0.1%

NAME_INCOME_TYPE
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
4      18819
0      8490
1      6152
2      2985
3      11

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
4      18819  51.6%
0      8490   23.3%
1      6152   16.9%
2      2985   8.2%
3      11     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
4      18819  51.6%
0      8490   23.3%
1      6152   16.9%
2      2985   8.2%
3      11     < 0.1%

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
4      24777
1      9864
2      1410
3      374
0      32

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value  Count  Frequency (%)
4      24777  68.0%
1      9864   27.1%
2      1410   3.9%
3      374    1.0%
0      32     0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
4      24777  68.0%
1      9864   27.1%
2      1410   3.9%
3      374    1.0%
0      32     0.1%

NAME_FAMILY_STATUS
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
1      25048
3      4829
0      2945
2      2103
4      1532

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
1      25048  68.7%
3      4829   13.2%
0      2945   8.1%
2      2103   5.8%
4      1532   4.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      25048  68.7%
3      4829   13.2%
0      2945   8.1%
2      2103   5.8%
4      1532   4.2%

NAME_HOUSING_TYPE
Real number (ℝ≥0)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.282881203
Minimum: 0
Maximum: 5
Zeros: 168
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9516747501
Coefficient of variation (CV): 0.7418260929
Kurtosis: 9.496421527
Mean: 1.282881203
Median Absolute Deviation (MAD): 0
Skewness: 3.290847045
Sum: 46770
Variance: 0.9056848299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
1      32548  89.3%
5      1776   4.9%
2      1128   3.1%
4      575    1.6%
3      262    0.7%
0      168    0.5%

Value  Count  Frequency (%)
0      168    0.5%
1      32548  89.3%
2      1128   3.1%
3      262    0.7%
4      575    1.6%
5      1776   4.9%

Value  Count  Frequency (%)
5      1776   4.9%
4      575    1.6%
3      262    0.7%
2      1128   3.1%
1      32548  89.3%
0      168    0.5%

FLAG_WORK_PHONE
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      28235
1      8222

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      28235  77.4%
1      8222   22.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      28235  77.4%
1      8222   22.6%

FLAG_PHONE
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      25709
1      10748

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      25709  70.5%
1      10748  29.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      25709  70.5%
1      10748  29.5%

FLAG_EMAIL
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      33186
1      3271

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      33186  91.0%
1      3271   9.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      33186  91.0%
1      3271   9.0%

OCCUPATION_TYPE
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 19
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.170529665
Minimum: 0
Maximum: 18
Zeros: 1241
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 6
median: 10
Q3: 12
95-th percentile: 15
Maximum: 18
Range: 18
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.292975215
Coefficient of variation (CV): 0.4681272916
Kurtosis: -0.7087343605
Mean: 9.170529665
Median Absolute Deviation (MAD): 2
Skewness: -0.437895487
Sum: 334330
Variance: 18.42963619
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Value             Count  Frequency (%)
12                11323  31.1%
8                 6211   17.0%
3                 3591   9.8%
15                3485   9.6%
10                3012   8.3%
4                 2138   5.9%
6                 1383   3.8%
0                 1241   3.4%
11                1207   3.3%
2                 655    1.8%
Other values (9)  2211   6.1%

Value  Count  Frequency (%)
0      1241   3.4%
1      551    1.5%
2      655    1.8%
3      3591   9.8%
4      2138   5.9%
5      85     0.2%
6      1383   3.8%
7      60     0.2%
8      6211   17.0%
9      175    0.5%

Value  Count  Frequency (%)
18     174    0.5%
17     592    1.6%
16     151    0.4%
15     3485   9.6%
14     79     0.2%
13     344    0.9%
12     11323  31.1%
11     1207   3.3%
10     3012   8.3%
9      175    0.5%

CNT_FAM_MEMBERS
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.198452972
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 3
95-th percentile: 4
Maximum: 20
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9116861437
Coefficient of variation (CV): 0.4146944034
Kurtosis: 8.188695361
Mean: 2.198452972
Median Absolute Deviation (MAD): 0
Skewness: 1.298595907
Sum: 80149
Variance: 0.8311716246
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
2      19463  53.4%
1      6987   19.2%
3      6421   17.6%
4      3106   8.5%
5      397    1.1%
6      58     0.2%
7      19     0.1%
15     3      < 0.1%
9      2      < 0.1%
20     1      < 0.1%

Value  Count  Frequency (%)
1      6987   19.2%
2      19463  53.4%
3      6421   17.6%
4      3106   8.5%
5      397    1.1%
6      58     0.2%
7      19     0.1%
9      2      < 0.1%
15     3      < 0.1%
20     1      < 0.1%

Value  Count  Frequency (%)
20     1      < 0.1%
15     3      < 0.1%
9      2      < 0.1%
7      19     0.1%
6      58     0.2%
5      397    1.1%
4      3106   8.5%
3      6421   17.6%
2      19463  53.4%
1      6987   19.2%

AGE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7183
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.73853914
Minimum: 20.50418558
Maximum: 68.86383704
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 20.50418558
5-th percentile: 27.03409379
Q1: 34.11979712
median: 42.61004675
Q3: 53.2194364
95-th percentile: 63.02388139
Maximum: 68.86383704
Range: 48.35965146
Interquartile range (IQR): 19.09963928

Descriptive statistics

Standard deviation: 11.50071512
Coefficient of variation (CV): 0.2629423696
Kurtosis: -1.045643576
Mean: 43.73853914
Median Absolute Deviation (MAD): 9.377331499
Skewness: 0.1842296496
Sum: 1594575.921
Variance: 132.2664484
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
34.70570922          54     0.1%
42.48957884          54     0.1%
46.25967679          38     0.1%
40.15688207          37     0.1%
41.45191209          32     0.1%
45.90922469          32     0.1%
42.91669233          32     0.1%
38.70305345          30     0.1%
37.75026181          30     0.1%
27.87736915          29     0.1%
Other values (7173)  36089  99.0%

Value        Count  Frequency (%)
20.50418558  1      < 0.1%
21.09557349  1      < 0.1%
21.14485581  2      < 0.1%
21.23794465  4      < 0.1%
21.79100187  2      < 0.1%
21.84849792  1      < 0.1%
22.01551024  4      < 0.1%
22.05110303  1      < 0.1%
22.05657885  2      < 0.1%
22.08669583  1      < 0.1%

Value        Count  Frequency (%)
68.86383704  2      < 0.1%
68.83098216  3      < 0.1%
68.71872797  1      < 0.1%
68.68861099  1      < 0.1%
68.47505424  2      < 0.1%
68.36553796  2      < 0.1%
68.34637262  1      < 0.1%
68.2998282   3      < 0.1%
68.2614975   4      < 0.1%
68.21221517  3      < 0.1%

YEARS_EMPLOYED
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 3640
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.024263792
Minimum: 0
Maximum: 43.0207328
Zeros: 6135
Zeros (%): 16.8%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.117066059
median: 4.249231675
Q3: 8.632620793
95-th percentile: 19.72661999
Maximum: 43.0207328
Range: 43.0207328
Interquartile range (IQR): 7.515554734

Descriptive statistics

Standard deviation: 6.480069439
Coefficient of variation (CV): 1.075661635
Kurtosis: 3.847041793
Mean: 6.024263792
Median Absolute Deviation (MAD): 3.583920272
Skewness: 1.758767075
Sum: 219626.5851
Variance: 41.99129994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    6135   16.8%
1.09790071           78     0.2%
4.213638884          64     0.2%
0.5475814014         63     0.2%
4.594207958          61     0.2%
5.714011924          61     0.2%
6.929642635          56     0.2%
1.259437223          54     0.1%
3.175972128          53     0.1%
5.631874713          52     0.1%
Other values (3630)  29780  81.7%

Value          Count  Frequency (%)
0              6135   16.8%
0.04654441912  3      < 0.1%
0.1177300013   1      < 0.1%
0.1779639555   2      < 0.1%
0.1807018625   1      < 0.1%
0.1916534905   4      < 0.1%
0.1943913975   1      < 0.1%
0.1998672115   17     < 0.1%
0.2135567465   1      < 0.1%
0.2162946536   1      < 0.1%

Value        Count  Frequency (%)
43.0207328   1      < 0.1%
42.87836164  4      < 0.1%
41.69011     1      < 0.1%
41.26573441  3      < 0.1%
41.17264557  16     < 0.1%
40.75922161  6      < 0.1%
40.54840277  8      < 0.1%
40.45257603  2      < 0.1%
39.79821625  4      < 0.1%
39.62572811  6      < 0.1%

STATUS
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 284.9 KiB

Value  Count
0      27711
-1     4455
1      4291

Length

Max length: 2
Median length: 1
Mean length: 1.122198755
Min length: 1
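
The non-integer mean length comes from the label `-1`, which is two characters long while `0` and `1` are one. A minimal sketch of that calculation, assuming the lengths are taken over the string form of each category label:

```python
# Category label -> count, from the STATUS overview.
counts = {"0": 27711, "-1": 4455, "1": 4291}

n = sum(counts.values())

# Average label length, weighted by how often each label occurs.
mean_length = sum(len(label) * count for label, count in counts.items()) / n
max_length = max(len(label) for label in counts)

print(round(mean_length, 9), max_length)
```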

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: -1

Common Values

Value  Count  Frequency (%)
0      27711  76.0%
-1     4455   12.2%
1      4291   11.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      27711  76.0%
1      8746   24.0%

MONTHS_BALANCE
Real number (ℝ≥0)

Distinct: 61
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.16419343
Minimum: 0
Maximum: 60
Zeros: 315
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 284.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 12
median: 24
Q3: 39
95-th percentile: 55
Maximum: 60
Range: 60
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 16.50185447
Coefficient of variation (CV): 0.6307037329
Kurtosis: -1.037761862
Mean: 26.16419343
Median Absolute Deviation (MAD): 14
Skewness: 0.2863945674
Sum: 953868
Variance: 272.3112008
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
7                  889    2.4%
11                 828    2.3%
6                  824    2.3%
8                  820    2.2%
5                  816    2.2%
17                 807    2.2%
3                  800    2.2%
10                 798    2.2%
16                 785    2.2%
15                 774    2.1%
Other values (51)  28316  77.7%

Value  Count  Frequency (%)
0      315    0.9%
1      551    1.5%
2      643    1.8%
3      800    2.2%
4      765    2.1%
5      816    2.2%
6      824    2.3%
7      889    2.4%
8      820    2.2%
9      770    2.1%

Value  Count  Frequency (%)
60     321    0.9%
59     307    0.8%
58     333    0.9%
57     304    0.8%
56     345    0.9%
55     368    1.0%
54     358    1.0%
53     377    1.0%
52     463    1.3%
51     476    1.3%

Interactions

[Interaction plots for pairs of numeric variables; figures not reproduced]
2022-04-25T09:54:44.430602image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:19.876461image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:23.346950image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:26.600509image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:29.245744image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:31.390013image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:34.250717image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:37.664539image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:39.777963image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:42.297570image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:44.700437image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:20.203726image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:24.164586image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:26.827812image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:29.462740image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:31.636461image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:34.754848image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:37.948981image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:39.982687image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:42.497461image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:44.937932image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:20.475199image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:24.619854image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:27.167106image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:29.706570image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:31.855899image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:35.206821image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:38.149485image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:40.183592image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:42.678880image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:45.153224image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:20.732223image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:24.838977image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:27.433176image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:29.905103image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:32.058997image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:35.468786image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:38.336140image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:40.587000image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-04-25T09:54:42.856645image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
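The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs; the helper names `average_ranks`, `covariance`, and `spearman_rho` are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / sqrt(covariance(rx, rx) * covariance(ry, ry))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [v ** 2 for v in x]  # nonlinear but strictly increasing
    print(spearman_rho(x, y))
```

Because only the ranks enter the formula, any strictly increasing transformation of X or Y leaves ρ unchanged. The sketch fails with a division by zero if either variable is constant.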

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
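As a rough illustration of this formula and of the invariance property, here is a small standard-library sketch. The function name `pearson_r` and the sample values are made up for the example, and the rescaling uses positive scale factors only (a negative factor would flip the sign of r).

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]

    r_original = pearson_r(x, y)
    # Shift and rescale each variable separately (positive scale factors).
    r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

    print(r_original, r_rescaled)  # the two values agree up to rounding error
```

The 1/n factors cancel between numerator and denominator, so using sample (n - 1) or population (n) statistics gives the same r as long as the choice is consistent.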

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
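The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, assuming a hypothetical function name `kendall_tau_a` and a brute-force O(n²) loop over all pairs. Tie-adjusted variants such as tau-b, which statistical libraries often use, differ in the denominator and are not shown here.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, counting tied pairs as neither."""
    concordant = discordant = total = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        total += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    return (concordant - discordant) / total


if __name__ == "__main__":
    # Three pairs in total: two concordant, one discordant.
    print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

With fewer than two observations there are no pairs, and the sketch raises a division-by-zero error rather than returning a value.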

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
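For readers who want to see what the bias correction does, the following sketch implements the commonly cited form of Bergsma's corrected estimator in plain Python. It is an illustration under that assumption, not the report generator's own code; the function name `cramers_v_corrected` and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals, col_totals = Counter(x), Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi^2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")  # undefined when a variable has a single category
    return sqrt(phi2_corr / denom)


if __name__ == "__main__":
    x = ["a", "a", "b", "b"] * 25
    print(cramers_v_corrected(x, x))                        # identical labels
    print(cramers_v_corrected(x, ["c", "d", "c", "d"] * 25))  # balanced, unrelated labels
```

The uncorrected estimator would use `phi2`, `r`, and `k` directly; the correction matters most for small samples and tables with many categories, where the uncorrected value tends to overstate the association.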

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

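(index), Unnamed: 0, ID, CODE_GENDER, FLAG_OWN_CAR, FLAG_OWN_REALTY, CNT_CHILDREN, AMT_INCOME_TOTAL, NAME_INCOME_TYPE, NAME_EDUCATION_TYPE, NAME_FAMILY_STATUS, NAME_HOUSING_TYPE, FLAG_WORK_PHONE, FLAG_PHONE, FLAG_EMAIL, OCCUPATION_TYPE, CNT_FAM_MEMBERS, AGE, YEARS_EMPLOYED, STATUS, MONTHS_BALANCE
0, 0, 5008804, 1, 1, 1, 0, 427500.0, 4, 1, 0, 4, 1, 0, 0, 12, 2, 32.868574, 12.435574, 1, 15
1, 1, 5008805, 1, 1, 1, 0, 427500.0, 4, 1, 0, 4, 1, 0, 0, 12, 2, 32.868574, 12.435574, 1, 14
2, 2, 5008806, 1, 1, 1, 0, 112500.0, 4, 4, 1, 1, 0, 0, 0, 17, 2, 58.793815, 3.104787, 0, 29
3, 3, 5008808, 0, 0, 1, 0, 270000.0, 0, 4, 3, 1, 0, 1, 1, 15, 1, 52.321403, 8.353354, 0, 4
4, 4, 5008809, 0, 0, 1, 0, 270000.0, 0, 4, 3, 1, 0, 1, 1, 15, 1, 52.321403, 8.353354, -1, 26
5, 5, 5008810, 0, 0, 1, 0, 270000.0, 0, 4, 3, 1, 0, 1, 1, 15, 1, 52.321403, 8.353354, 0, 26
6, 6, 5008811, 0, 0, 1, 0, 270000.0, 0, 4, 3, 1, 0, 1, 1, 15, 1, 52.321403, 8.353354, 0, 38
7, 7, 5008812, 0, 0, 1, 0, 283500.0, 1, 1, 2, 1, 0, 0, 0, 12, 1, 61.504343, 0.000000, 0, 20
8, 8, 5008813, 0, 0, 1, 0, 283500.0, 1, 1, 2, 1, 0, 0, 0, 12, 1, 61.504343, 0.000000, 0, 16
9, 9, 5008814, 0, 0, 1, 0, 283500.0, 1, 1, 2, 1, 0, 0, 0, 12, 1, 61.504343, 0.000000, 0, 17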

Last rows

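(index), Unnamed: 0, ID, CODE_GENDER, FLAG_OWN_CAR, FLAG_OWN_REALTY, CNT_CHILDREN, AMT_INCOME_TOTAL, NAME_INCOME_TYPE, NAME_EDUCATION_TYPE, NAME_FAMILY_STATUS, NAME_HOUSING_TYPE, FLAG_WORK_PHONE, FLAG_PHONE, FLAG_EMAIL, OCCUPATION_TYPE, CNT_FAM_MEMBERS, AGE, YEARS_EMPLOYED, STATUS, MONTHS_BALANCE
36447, 36447, 5149145, 1, 1, 1, 0, 247500.0, 4, 4, 1, 1, 1, 0, 0, 8, 2, 29.985558, 9.793493, 1, 25
36448, 36448, 5149158, 1, 1, 1, 0, 247500.0, 4, 4, 1, 1, 1, 0, 0, 8, 2, 29.985558, 9.793493, 1, 28
36449, 36449, 5149190, 1, 1, 0, 1, 450000.0, 4, 1, 1, 1, 0, 1, 1, 3, 3, 26.960170, 1.374429, 1, 11
36450, 36450, 5149729, 1, 1, 1, 0, 90000.0, 4, 4, 1, 1, 0, 0, 0, 12, 2, 52.296762, 4.711938, 1, 21
36451, 36451, 5149775, 0, 1, 1, 0, 130500.0, 4, 4, 1, 1, 0, 1, 0, 8, 2, 44.181605, 25.711685, 1, 19
36452, 36452, 5149828, 1, 1, 1, 0, 315000.0, 4, 4, 1, 1, 0, 0, 0, 10, 2, 47.497211, 6.625735, 1, 11
36453, 36453, 5149834, 0, 0, 1, 0, 157500.0, 0, 1, 1, 1, 0, 1, 1, 11, 2, 33.914454, 3.627727, 1, 23
36454, 36454, 5149838, 0, 0, 1, 0, 157500.0, 1, 1, 1, 1, 0, 1, 1, 11, 2, 33.914454, 3.627727, 1, 32
36455, 36455, 5150049, 0, 0, 1, 0, 283500.0, 4, 4, 1, 1, 0, 0, 0, 15, 2, 49.167334, 1.793329, 1, 9
36456, 36456, 5150337, 1, 0, 1, 0, 112500.0, 4, 4, 3, 4, 0, 0, 0, 8, 1, 25.155890, 3.266323, 1, 13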